feat(evaluators): new api from runners-api #242

Open
namrataghadi-galileo wants to merge 10 commits into main from feature/67157-refactor-api-runner-api

namrataghadi-galileo commented Jun 16, 2026

Contributor

Summary

  • Refactored Galileo Luna invocation so Agent Control calls runners-api directly at /api/v1/scorers/invoke.
  • Switched Luna auth to internal JWT only, signed with GALILEO_API_SECRET_KEY or GALILEO_API_SECRET.
  • Made scorer_id the required runtime identity; scorer_label is optional metadata and scorer_version_id is optional pinning.
  • Ensured requests always include config: {} and never forward Galileo-API-Key to runners-api.

Scope

  • User-facing/API changes:

    • Luna configs now require scorer_id; label-only and version-only configs are rejected.
    • Galileo Luna examples/docs now document GALILEO_RUNNERS_API_URL, GALILEO_LUNA_SCORER_ID, and internal secret auth.
  • Internal changes:

    • Replaced API /scorers/invoke and /internal/scorers/invoke client routing with runners-api routing.
    • Removed Luna auth mode selection and API URL/console URL derivation.
    • Added tests for endpoint, JWT payload, scorer ID validation, version forwarding, API-key stripping, and request body shape.
  • Out of scope:

    • Agent Control UI changes.
    • runners-api/API service implementation changes.
    • Scorer hydration, version resolution, caching, retries, or protect execution logic inside Agent Control.
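
The config contract from the user-facing changes above (scorer_id required, label-only and version-only configs rejected) can be illustrated with a small validator. This is a sketch only: the schema in this PR uses a pydantic-style `Field`, while the version below uses a standard-library dataclass so that it runs without dependencies. `LunaConfigSketch` and `parse_luna_config` are hypothetical names, and other fields such as `threshold` and `operator` are omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LunaConfigSketch:
    """Stand-in for the galileo.luna config; not the PR's actual schema."""
    scorer_id: str                            # required runtime identity
    scorer_label: Optional[str] = None        # optional display metadata
    scorer_version_id: Optional[str] = None   # optional version pin


def parse_luna_config(raw: dict) -> LunaConfigSketch:
    """Reject configs without a usable scorer_id, including label-only ones."""
    scorer_id = raw.get("scorer_id")
    if not isinstance(scorer_id, str) or not scorer_id.strip():
        raise ValueError(
            "galileo.luna config requires scorer_id; "
            "label-only and version-only configs are rejected"
        )
    return LunaConfigSketch(
        scorer_id=scorer_id,
        scorer_label=raw.get("scorer_label"),
        scorer_version_id=raw.get("scorer_version_id"),
    )
```

With this shape, a payload such as `{"scorer_label": "toxicity", "scorer_id": None}` fails at parse time instead of reaching the invoke call.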

Risk and Rollout

  • Compatibility note: galileo.luna now requires scorer_id. scorer_label is optional display/metadata and scorer_version_id is an optional version pin. Any saved Luna controls that only have scorer_label or scorer_version_id must be re-saved through the UI or manually updated to include the resolved scorer_id before they can be evaluated with this version. The enterprise UI already resolves the selected scorer and saves scorer_id.
  • Risk level: medium
  • Rollback plan:
    • Revert this PR to restore the previous API-mediated Luna invocation path.
    • If rollout exposes saved controls without scorer_id, migrate those controls or temporarily roll back before retrying cutover.
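
For the rollout step above, a pre-cutover audit could flag saved controls that would fail under the new contract. The storage shape here is an assumption: each control is taken to be a dict with hypothetical `name`, `evaluator`, and `config` keys, so the lookup needs adapting to however controls are actually persisted.

```python
LUNA_EVALUATOR = "galileo.luna"  # evaluator name used in this PR


def find_controls_missing_scorer_id(controls):
    """Return names of galileo.luna controls whose config lacks scorer_id."""
    missing = []
    for control in controls:
        if control.get("evaluator") != LUNA_EVALUATOR:
            continue  # other evaluators are unaffected by this change
        config = control.get("config") or {}
        if not config.get("scorer_id"):
            # Label-only, version-only, or explicit null scorer_id.
            missing.append(control.get("name", "<unnamed>"))
    return missing
```

Any names it returns are the controls to re-save through the UI or update manually before cutover.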

Testing

  • Added or updated automated tests
  • Did not run make check; targeted package validation was run instead
  • Manually verified behavior

Targeted validation:

  • make -C evaluators/contrib/galileo test -> 86 passed
  • make -C evaluators/contrib/galileo lint -> passed
  • make -C evaluators/contrib/galileo typecheck -> passed
  • git diff --check -> clean

Checklist

  • Linked issue/spec: RFC on direct runners-api scorer invoke
  • Updated docs/examples for user-facing changes
  • Included required follow-up tasks: confirm/migrate any legacy saved Luna controls missing scorer_id

codecov Bot commented Jun 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

namrataghadi-galileo changed the title from "feat: new api from runners-api" to "feat(evaluators): new api from runners-api" on Jun 16, 2026

scorer_label: str | None = Field(
default=None,
scorer_id: str = Field(

Collaborator

This backend schema now requires scorer_id, but the Luna UI still accepts label-only/version-only configs and sends scorer_id: null. That creates a save-time validation error. Can we update the UI validation/copy to require scorer_id, or keep the backend one-of identifier contract?

Contributor Author

@abhinav-galileo The enterprise UI fetches scorers from /v1/scorers?scorer_type=preset&scorer_type=luna, which returns scorer_id and scorer_version_id, so the UI resolves the selected Luna scorer before saving. The visible field is label-oriented: we show the scorer label on screen, but selecting it populates the hidden scorer_id and scorer_version_id fields, so Agent Control receives the canonical scorer ID.

Given that, I think we should keep the backend contract as scorer_id required and not restore label-only/version-only support in Agent Control. Accepting label-only in AC would require AC to perform scorer lookup/resolution itself.

configuration to the direct Luna scorer fields (`scorer_label`, `scorer_id`, or
`scorer_version_id`, plus `threshold` and `operator`). If you still need the
legacy Luna2 evaluator, pin `agent-control-evaluator-galileo <8`.
configuration to use the direct Luna scorer fields. `scorer_id` is required;

Collaborator

Nit: since scorer_id is now required, can we confirm the rollout path for any saved galileo.luna controls that only have scorer_label or scorer_version_id? A short migration/compat note here or in the PR description would make the break explicit.

Contributor Author

Good call. I’ll make this explicit in the PR notes.

Rollout path: this is a breaking config contract for galileo.luna. Saved Luna controls must include scorer_id; scorer_label is now display/metadata only, and scorer_version_id is an optional version pin. The enterprise UI already resolves the selected scorer before save and persists scorer_id.

2 participants